Annotated Trees and their Applications to XML Compression
Authors
Abstract
Permutation-based XML-conscious compressors permute the input document to improve the compression ratio and to support efficient operations such as queries and updates. One such compressor, XSAQCT, uses the properties of the permuted document, called an annotated tree, to support these operations. This paper provides the formal background for the definition of an annotated tree of a document D. It also provides an algorithm for creating an annotated tree from an XML document, together with the reverse algorithm, and discusses a measure of compressibility based on the annotated tree. The theoretical and algorithmic approaches are followed by experimental results showing the compressibility of annotated trees and by a general analysis of semi-structured data and XML compression.
Similar resources
XML tree structure compression using RePair
XML tree structures can conveniently be represented using ordered unranked trees. Due to the repetitiveness of XML markup these trees can be compressed effectively using dictionary-based methods, such as minimal directed acyclic graphs (DAGs) or straight-line context-free (SLCF) tree grammars. While minimal SLCF tree grammars are in general smaller than minimal DAGs, they cannot be computed in ...
Pólya Urn Models and Connections to Random Trees: A Review
This paper reviews Pólya urn models and their connection to random trees. Basic results are presented, together with proofs that underlie the historical evolution of the accompanying thought process. Extensions and generalizations are given according to chronology: • Pólya-Eggenberger’s urn • Bernard Friedman’s urn • Generalized Pólya urns • Extended urn schemes • Invertible urn schemes ...
Automized Generation of Typed Syntax Trees via XML
The XANTLR/TDOM project is an implementation of a “typed” XML[2] Document Object Model initially used to represent abstract syntax trees in a compiler project. Tree classes, SAX event receivers, visitor classes and DTD are automatically derived from a sparsely annotated ANTLR grammar. Mapping tag values onto the type system of the target language allows for the compilation of syntax, mostly yie...
Compressing and Filtering XML Streams
Information technology is widely adopting the use of XML for information exchange. As messaging standards migrate to XML, there is growing concern for the magnitude of messages compared to binary formatted messages. XML compression can help mitigate the risk of exceeding the capacity of current communication resources. However, it is critical that compression technologies do not hinder XML quer...
Updates on Grammar-Compressed XML Data
In this paper, we present updates on CluX, a grammar-based XML compression approach based on clustering XML sub-trees. We show that updates on CluX-compressed data can be performed faster than decompressing the data, loading it into main memory and compressing it. Furthermore, we show how to support fast multiple updates, e.g. performing 100 updates in parallel is more than 70 times faster than...
Journal title:
Volume, issue:
Pages: -
Publication date: 2014